(54) Method for estimating the noise level in a video sequence

(57) In noise measurement for video sequences it is difficult to distinguish between picture content and noise. In order to improve the measurement reliability, the results ($\sigma_{p1}$, $\sigma_{p2}$, $\sigma_{p3}$) of two different noise level computing methods are combined. One computation relies on the analysis of displaced field or frame differences (DFD), the other is based on the values of the field or frame differences (FD) over static picture areas.
Description

[0001] The invention relates to a method for estimating the noise level in a video sequence.

Background

[0002] EP-A-0562407 discloses a method of noise measurement in conjunction with a block-matching motion estimation algorithm, the principle of which is to derive a noise level from the minimum of accumulated absolute pixel difference values, leading to a displaced field or frame difference (DFD) value, the accumulation taking place over predetermined pixel blocks.

A paper by Q. Zhang and R. Ward, entitled "Automatic assessment of signal-to-thermal noise ratio of television images", Vol. 41, No. 1, IEEE Transactions on Consumer Electronics (February 1995), discloses a method for measuring the noise level from the TV pictures as such. This method is based on the application of a two-dimensional highpass filter on the images in order to remove the majority of the (non-noisy) image content. Thereafter the smoothest regions of the picture, i.e. those having minimum energy with respect to brightness variations, are selected and the noise power is estimated from their remaining average power.

That paper says that in digital image processing the customary procedure to estimate the level of thermal noise in the 
image is to analyse "smooth regions, i.e. regions containing constant luminance (grey levels)". 

Invention

[0003] The method described in EP-A-0562407 lacks robustness because it is based solely on the minimum of the distribution of estimates for each block of the picture, and therefore depends on the shape and deviation of this distribution. The method described by Zhang et al. suffers from the same shortcoming, as the computation of the noise level is eventually based on the low-end tail of the distribution of noise energies over subimages of a picture. Thus, for pictures with few areas of high spatial frequencies, there is a risk of under-estimating the noise level.
The proposed method alleviates this problem by basing the estimation largely on averages, rather than minima, of noise energy measurements. The measurement performed over static areas, in particular, is independent of the spatial frequency contents of the picture.

[0004] It is one object of the invention to disclose a method for more reliable noise estimation. This object is achieved by the method disclosed in claim 1.

[0005] In the invention, additional motion information provided by e.g. a motion-compensated interpolation is used in order to compute a more robust and accurate estimation of the noise level in a video sequence. Ideally, if motion estimation is error-free, the remaining differences between the grey levels of input pixels from the two source picture blocks put in correspondence by an estimated motion vector must be the result of noise.

The additional motion information may also be derived from motion vector information of an MPEG bitstream.
[0006] Modifying the field or frame rate of a video sequence by interpolating pictures located temporally between the source pictures is required for picture rate upconversion or standards conversion. The best conversion quality is achieved if the motion of objects in the source sequence is estimated and used to interpolate each pixel along the direction of its associated motion vector. Another application of this technique is noise reduction by means of a temporal filter, with the goal of improving either the picture quality or the coding efficiency, e.g. of MPEG2 encoders.
Motion estimation can be performed by finding the vectors that provide the best match between pixels or blocks of pixels mapped from a previous or current picture to a next picture. The mathematical criterion used for the selection of a motion vector is usually the minimisation of the sum of the absolute values of the displaced field difference or displaced frame difference of a pixel block, as described in Fig. 1. An intermediate field or frame IF to be interpolated is located temporally between a previous field or frame PF and a next field or frame NF.

The temporal distance between PF and NF is $T$, between PF and IF $\alpha T$, and between IF and NF $(1-\alpha) T$.
The zero vector $\mathbf{0} = (0,0)$ passes through points $I_p(x,y)$ in PF, $I(x,y)$ in IF, and $I_n(x,y)$ in NF. A current candidate motion vector $\mathbf{v} = (v_x, v_y)$ passes through points $I_p(x-\alpha v_x,\ y-\alpha v_y)$ in PF, $I(x,y)$ in IF, and $I_n(x+(1-\alpha) v_x,\ y+(1-\alpha) v_y)$ in NF.
The frame difference (for vector $\mathbf{0}$) is

$$FD = I_n(x,y) - I_p(x,y).$$

The displaced frame difference for vector $\mathbf{v}$ is

$$DFD(\mathbf{v}) = I_n(x+(1-\alpha) v_x,\ y+(1-\alpha) v_y) - I_p(x-\alpha v_x,\ y-\alpha v_y).$$
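By way of illustration only, the two difference signals defined above can be sketched in Python as follows. The function names, the representation of a picture as a list of rows, and the rounding of displaced positions to the nearest pixel are assumptions of this sketch and are not part of the described method; picture borders are not checked.

```python
def frame_difference(prev, nxt, x, y):
    """FD at (x, y): next-picture sample minus previous-picture sample."""
    return nxt[y][x] - prev[y][x]


def displaced_frame_difference(prev, nxt, x, y, vx, vy, alpha):
    """DFD at (x, y) for the candidate motion vector (vx, vy).

    alpha is the temporal position of the current picture between the
    previous picture (alpha = 0) and the next picture (alpha = 1).
    Displaced positions are rounded to the nearest pixel, which is an
    assumption of this sketch; no border handling is performed.
    """
    xp = round(x - alpha * vx)        # position in the previous picture PF
    yp = round(y - alpha * vy)
    xn = round(x + (1 - alpha) * vx)  # position in the next picture NF
    yn = round(y + (1 - alpha) * vy)
    return nxt[yn][xn] - prev[yp][xp]
```

With a correct vector and noise-free pictures the DFD vanishes, whereas the FD at the same position generally does not.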
[0007] The interpolation of the output pictures is carried out along the direction of the estimated motion vectors. The quality of the interpolation is limited by the accuracy of the motion vectors, except in static parts of the pictures where the motion is known to be exactly zero. It is therefore advantageous to detect static areas in the source images and to implement a specific interpolation mode for moving pixels, thereby optimising the interpolation output resolution. A specific solution for detecting such static areas is disclosed in another application of the applicant, internal reference PF980013, filed at the same date.

[0008] The inventive noise level estimation, however, is based on source pictures only. Therefore, if Fig. 1 is applied to the noise level estimation, intermediate field or frame IF is that current source picture for which the noise level is to be estimated.

According to the invention, the results of two different noise level computing methods can be combined in order to improve the reliability of the noise level estimation. One computation relies on the analysis of DFDs, the other is based on the values of the field or frame differences (FD) over static areas.

[0009] The availability of an accurate estimate of the noise level potentially improves the performance of many image processing algorithms in the presence of noise because it allows the algorithm parameters and thresholds to be adapted to that noise level. Applications include: motion estimation, noise reduction, detection of static areas, film mode and film phase detection, detection of cuts, and many others.

[0010] In principle, the inventive method is suited for estimating the noise level for a current source field or frame of a video sequence, based on the differences between pixel values of blocks in a previous field or frame and corresponding pixel values of corresponding blocks in a future field or frame, wherein either said previous or said future field or frame can be said current field or frame itself, and wherein at least one block of each corresponding couple of blocks is a motion-compensated pixel block or is mapped to the other block by an associated motion vector estimate.

In addition, static picture areas can be determined and the differences between pixel values of blocks in a static picture area of a previous field or frame and corresponding pixel values of corresponding blocks in a future field or frame can be used to estimate a further noise level estimate which is then combined with said noise level estimate in order to form a final noise level estimate, wherein said previous and/or said future field or frame used for the evaluation of said differences between pixel values of a block in a static picture area can be different from said previous and/or said future field or frame used for the evaluation of the differences concerning said motion-compensated pixel blocks or said mapped blocks.

[0011] Advantageous additional embodiments of the inventive method are disclosed in the respective dependent 
claims. 


Drawings 

[0012] Embodiments of the invention are described with reference to the accompanying drawings, which show in: 

Fig. 1 picture to be interpolated between and from a previous source picture and a next source picture, or, current source picture, between a previous source picture and a next source picture, for which the noise level is to be estimated;

Fig. 2 flow chart for the inventive noise level computation.

Embodiments

[0013] Input data to the inventive noise level estimation (for one field or frame, hereafter referred to as the current field or frame) include:

- a map of displaced field or frame differences, which may be a by-product of a motion estimation;
- a map of the input pixels or blocks of pixels which have been detected as being non-moving;
- a map of field or frame differences, computed between a previous frame and a next frame located in time respectively before and after the current frame, if the source images are progressive ones, or, in case of interlaced source images, computed between a previous field and a next field located in time respectively before and after the current field, with the constraint that said previous and next fields have the same parity, i.e. that both are top fields or both are bottom fields; in both alternatives the previous field or frame or the next field or frame may be the current field or frame;
- the estimate for the noise level derived for a previous source field or frame.

[0014] The computation includes the following steps (cf. Fig. 2):

a) dividing the current source field or frame into a predetermined raster of FD blocks and integrating the absolute values of the FDs over only those FD blocks which are made up exclusively of pixels classified as static in the map of static areas;

b) translating the resulting block FDs into a first preliminary estimate of the standard deviation of the noise according to a predetermined noise model;

c) dividing the current field or frame into a predetermined raster of DFD blocks and integrating the absolute values of the DFDs over these blocks;

d) translating the resulting block DFDs into a second and a third preliminary estimate of the standard deviation of the noise according to a predetermined noise model;

e) computing a fourth preliminary estimate of the current noise level as a function of the first three preliminary estimates;

f) filtering this fourth preliminary estimate using the final noise level estimate computed for a previous field or frame to provide the final noise level estimate for the current field or frame.

A noise model assigns to a detected distribution of amounts of pixel differences in a block a corresponding noise level. 
[0015] Ideally, if motion estimation is error-free, the remaining differences between the grey levels of input pixels from the two source picture blocks corresponding to, or mapped by, a motion vector must be the result of noise. The statistical distribution of the DFDs thus provides an advantageous starting point for the noise level estimation.
In practical systems, however, the accuracy of motion estimation is limited by such factors as the finite coding accuracy of the vector components, the finite spatial resolution of the source pictures, deviation of the actual scene motion from the assumed motion direction model, which is usually a translative motion, and unavoidable estimation errors due to motion analysis failure, e.g. in objects containing periodic structures or in covered/uncovered areas or in pixel blocks containing a static area and smaller moving objects. The resulting motion estimation inaccuracies translate into residual DFD terms which add to the noise contribution, and thus bias the estimation of the real noise level.
[0016] Perfect motion estimation, i.e. with infinite accuracy, is nevertheless achievable on non-moving parts of the input sequence, provided that such areas exist in the current picture, which is not the case e.g. during camera panning, and that a secure method, which may or may not use motion vector information, of detecting these areas is implemented. Indeed, motion vector components in static areas are exactly zero. As a result, non-displaced frame differences or field differences between interlaced fields of identical parity, when computed on a static picture area, provide samples of pixel-wise interframe noise signal differences which are unspoiled by any residual terms stemming from motion estimation inaccuracies.

[0017] Advantageously, in the invention these two procedures are combined: one based on DFDs and the other based on FDs over static picture areas.

In situations, e.g. of camera pannings, where all picture pixels are in motion and therefore no noise level estimate can be derived from the FDs, advantageously a fallback scheme can be implemented. For example, it can be decided in these cases to base the estimation solely on DFD information, or to retain the estimate computed for the previous field or frame.

[0018] In step a) the absolute values of the FD samples are integrated over predetermined pixel blocks in the current field or frame, hereafter referred to as FD blocks FDB(i,j), which may or may not be overlapping. Only the FD blocks which are made up exclusively of pixels classified as non-moving in the map of static areas are used in the estimation process. For each of these FD blocks an accumulated frame difference AFD(i,j) is computed as the sum of the absolute values of the FDs associated to the pixels making up the block.
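Step a) may be sketched as follows. This is a minimal illustration that assumes a non-overlapping raster of equally sized blocks, with the FD map and the static map given as lists of rows; the function name and its parameters are hypothetical, and incomplete blocks at the right and bottom borders are simply ignored.

```python
def accumulated_fd_static_blocks(fd, static, block_h, block_w):
    """Return AFD(i, j) for every block that consists only of static pixels.

    fd      -- 2-D list of frame (or field) differences
    static  -- 2-D list of truthy/falsy flags, truthy meaning "non-moving"
    Blocks form a non-overlapping raster (an assumption of this sketch).
    """
    rows, cols = len(fd), len(fd[0])
    afds = []
    for top in range(0, rows - block_h + 1, block_h):
        for left in range(0, cols - block_w + 1, block_w):
            ys = range(top, top + block_h)
            xs = range(left, left + block_w)
            # Use the block only if every pixel in it is classified as static.
            if all(static[y][x] for y in ys for x in xs):
                afds.append(sum(abs(fd[y][x]) for y in ys for x in xs))
    return afds
```

An empty result corresponds to the situation in which no fully static block exists, e.g. during camera panning.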

[0019] It is the purpose of step b) to derive a first preliminary estimate $\sigma_{p1}$ of the standard deviation of noise, expressed in grey levels, from the set of {AFD(i,j)}. This computation can be adapted to an a priori noise model. In one embodiment of the invention it is assumed that the distribution of the absolute values of the FDs associated with static pixels is such that its mean $m_{|FD|}$ is proportional to the standard deviation $\sigma$ of the noise level to be estimated:

$$m_{|FD|} = k \cdot \sigma.$$

[0020] This assumption holds in particular when the samples of the source noise are spatially and temporally uncorrelated and follow a Gaussian distribution, in which case $k$ is found to equal $2/\sqrt{\pi} \approx 1.13$. In one embodiment of the invention, $k$ is set to this value. The mathematical expectancy of the AFDs, which can be approximated by the arithmetic average $\langle AFD(i,j) \rangle$ of the AFD(i,j) over the static blocks within the current field or frame, is given by $N_{FDB} \cdot m_{|FD|}$, where $N_{FDB}$ represents the number of pixels in an FD block.
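The stated value of $k$ can be checked with a short derivation, sketched here under the assumption that each FD sample over a static area is the difference of two independent zero-mean Gaussian noise samples $n_1$ and $n_2$ of standard deviation $\sigma$ (the symbols $n_1$ and $n_2$ are introduced for illustration only). It uses the fact that a zero-mean Gaussian variable of standard deviation $s$ has mean absolute value $s\sqrt{2/\pi}$:

```latex
\begin{align*}
FD &= n_2 - n_1 \sim \mathcal{N}(0,\, 2\sigma^2), \\
m_{|FD|} &= E\,|FD| = \sqrt{2\sigma^2}\,\sqrt{\tfrac{2}{\pi}}
          = \frac{2}{\sqrt{\pi}}\,\sigma \approx 1.13\,\sigma .
\end{align*}
```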

[0021] A good approximation to $\sigma$ can therefore be derived as:

$$\sigma_{p1} = \langle AFD(i,j) \rangle / (k \cdot N_{FDB})$$
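The translation of the accumulated differences into the first preliminary estimate may be sketched as below. The function name is hypothetical, and returning `None` when no static block is available is an assumption of this sketch that leaves the choice of fallback to the caller.

```python
import math

# k for spatially and temporally uncorrelated Gaussian noise, about 1.13.
K_GAUSSIAN = 2.0 / math.sqrt(math.pi)


def sigma_from_accumulated(acc_values, pixels_per_block, k=K_GAUSSIAN):
    """Mean of the accumulated block differences divided by k * N.

    acc_values       -- list of AFD(i, j) values of the static blocks
    pixels_per_block -- number of pixels N in one block
    Returns None if the list is empty (assumption: the caller then
    applies a fallback scheme).
    """
    if not acc_values:
        return None
    mean_acc = sum(acc_values) / len(acc_values)
    return mean_acc / (k * pixels_per_block)
```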

In step c), which is similar to step a), the absolute values of the DFD samples are integrated over predetermined blocks in the current field or frame, hereafter referred to as DFD blocks DFDB(i,j). These blocks may or may not be overlapping. For each DFD block DFDB(i,j) an accumulated DFD referred to as ADFD(i,j) is computed as the sum of the absolute values of the DFDs associated to the pixels making up the block.

In step d), which is similar to step b), the set of {ADFD(i,j)} is translated into a second ($\sigma_{p2}$) and a third ($\sigma_{p3}$) preliminary estimate of the standard noise deviation expressed in grey levels. The derivation of $\sigma_{p2}$ is identical to that of $\sigma_{p1}$ with the exception that the set of {AFD(i,j)} is replaced by the set of {ADFD(i,j)}. Let $N_{DFDB}$ be the number of pixels in a DFD block and $\langle ADFD(i,j) \rangle$ be the average of the ADFDs for the current field or frame. Then $\sigma_{p2}$ is computed as:

$$\sigma_{p2} = \langle ADFD(i,j) \rangle / (k \cdot N_{DFDB})$$



[0022] Unlike for FDs, however, the estimation of noise level based on DFDs may be biased by residual terms resulting from motion estimation imperfections as explained above. This is likely to occur if the processed fields or frames contain areas with high spatial gradients. In order to improve the robustness of the proposed method, a third preliminary estimate $\sigma_{p3}$ is derived from the minimum rather than the average of the ADFD(i,j):

$$\sigma_{p3} = \min(ADFD(i,j)) / (k \cdot N_{DFDB})$$
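The two DFD-based estimates of step d) may be sketched together as follows. The function name is hypothetical, and raising an error for an empty block list is an assumption of this sketch.

```python
import math

# k for spatially and temporally uncorrelated Gaussian noise, about 1.13.
K_GAUSSIAN = 2.0 / math.sqrt(math.pi)


def dfd_estimates(adfd_values, pixels_per_block, k=K_GAUSSIAN):
    """Return (sigma_p2, sigma_p3) from the accumulated DFDs of all blocks.

    sigma_p2 uses the average of the ADFD(i, j) values, sigma_p3 their
    minimum; both are scaled by k * N_DFDB.
    """
    if not adfd_values:
        raise ValueError("at least one DFD block is required")
    scale = k * pixels_per_block
    sigma_p2 = (sum(adfd_values) / len(adfd_values)) / scale
    sigma_p3 = min(adfd_values) / scale
    return sigma_p2, sigma_p3
```

Since the minimum never exceeds the average, sigma_p3 is never larger than sigma_p2, so their ratio is at least 1 whenever sigma_p3 is non-zero.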

[0023] In step e) a single preliminary estimate $\sigma_p$ is derived from $\sigma_{p1}$, $\sigma_{p2}$ and $\sigma_{p3}$. First, the ratio $r = \sigma_{p2}/\sigma_{p3}$ is thresholded to determine which preliminary estimates should be used.

A value of $r$ above a predetermined threshold $T_r$, set to a value in the range between "1" and "5", preferably to the value "2" in one embodiment of the invention, indicates a large variety of textures and therefore a significant proportion of high gradient areas in the source picture. In that case $\sigma_{p2}$ is deemed to be unreliable and the preliminary estimate $\sigma_p$ is computed from $\sigma_{p1}$ and $\sigma_{p3}$ only.

Conversely, if $r$ falls to or below $T_r$, indicating consistency of the estimates computed from the block DFDs, $\sigma_{p2}$ and $\sigma_{p3}$ as well as $\sigma_{p1}$ are used.
Advantageously $\sigma_p$ is derived as:

$$\sigma_p = (\sigma_{p1} + \sigma_{p3})/2 \quad \text{if } \sigma_{p2}/\sigma_{p3} > T_r$$

$$\sigma_p = \mathrm{median}\{\sigma_{p1},\ (\sigma_{p1} + \sigma_{p2})/2,\ \sigma_{p3}\} \quad \text{if } \sigma_{p2}/\sigma_{p3} \le T_r$$

where median() means a 3-tap median filter.
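The combination rule of step e) may be sketched as follows. The function name is hypothetical; the treatment of a zero third estimate, for which the ratio is undefined, is an assumption of this sketch and is not specified in the text.

```python
from statistics import median


def combine_estimates(sigma_p1, sigma_p2, sigma_p3, t_r=2.0):
    """Derive the preliminary estimate sigma_p from the three estimates.

    t_r is the ratio threshold (preferred value 2 in one embodiment).
    If sigma_p3 is zero the ratio is undefined; this sketch then treats
    the ratio as exceeding the threshold (an assumption).
    """
    ratio_exceeds = sigma_p3 <= 0 or sigma_p2 / sigma_p3 > t_r
    if ratio_exceeds:
        # sigma_p2 is deemed unreliable: use sigma_p1 and sigma_p3 only.
        return (sigma_p1 + sigma_p3) / 2.0
    # Consistent DFD estimates: 3-tap median over the three candidates.
    return median([sigma_p1, (sigma_p1 + sigma_p2) / 2.0, sigma_p3])
```

For example, with estimates 2.0, 9.0 and 3.0 the ratio is 3 and the result is the mean 2.5, whereas with 4.0, 3.0 and 2.0 the ratio is 1.5 and the result is the median of 4.0, 3.5 and 2.0, i.e. 3.5.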
[0024] Since fast variations of the actual noise level in a broadcast image sequence are very unlikely, in step f) a temporal low-pass filter is applied to $\sigma_p$ to further improve the robustness of the noise level estimation. The final estimate $\sigma$ of the standard deviation of the noise level is computed from $\sigma_p$ and from the noise level estimate $\sigma_{prev}$ of a previous frame or field or field of corresponding parity as:

$$\sigma = \mathrm{median}(\sigma_{prev} - \Delta v_{low},\ \sigma_p,\ \sigma_{prev} + \Delta v_{high})$$

[0025] $\Delta v_{low}$ and $\Delta v_{high}$ are predetermined constants that specify the maximum variations of the estimated noise level variance from one estimation cycle (e.g. field or frame) to the next. In one embodiment of the invention $\Delta v_{low}$ and $\Delta v_{high}$ are set to about "1" and about "0.25" grey levels, respectively.
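The temporal filtering of step f) amounts to limiting the change of the estimate per estimation cycle and may be sketched as follows. The function and parameter names are hypothetical; the default values are those of the embodiment mentioned above, for 8-bit pixel values.

```python
from statistics import median


def temporal_filter(sigma_p, sigma_prev, dv_low=1.0, dv_high=0.25):
    """Final estimate: median of the lower bound, sigma_p and the upper bound.

    The result equals sigma_p when it lies within
    [sigma_prev - dv_low, sigma_prev + dv_high] and the nearer bound
    otherwise, so the estimate may fall by at most dv_low and rise by
    at most dv_high per cycle.
    """
    return median([sigma_prev - dv_low, sigma_p, sigma_prev + dv_high])
```

With a previous estimate of 3.0, a preliminary estimate of 5.0 is limited to 3.25 and one of 1.0 is limited to 2.0.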

[0026] The threshold values given in this application are based on an 8-bit representation of the pixel values. If these 
pixel values have a different resolution, the threshold values should be adapted accordingly. 
[0027] It may happen that motion estimation is performed on couples of fields or frames that are not consecutive, in 
which case the current source picture for which the noise level is estimated may differ from the pictures used for motion 
estimation. This is the case e.g. in an MPEG2 encoding scheme if the current frame is a B-frame. 
[0028] One or both of said fields or frames used for determining the pixel value differences FD concerning static pic- 
ture areas can be different from one or both of said fields or frames used for determining the pixel value differences DFD 
concerning couples of motion-compensated blocks or couples of blocks mapped by their associated motion vector. 
[0029] One may use all blocks of the active part of the fields or frames concerned for the noise level computation. However, it is also possible to not consider pixel blocks which are located at the borders of the active picture part, in particular because the motion information for such blocks may be less reliable. It is also possible to limit further the number of blocks considered per picture.

Claims 


1. Method for estimating the noise level ($\sigma_{p2}$, $\sigma_{p3}$, $\sigma$) for a current source field or frame (IF) of a video sequence, based on the differences (DFD, FD) between pixel values of blocks in a previous field or frame (PF) and corresponding pixel values of corresponding blocks in a future field or frame (NF), wherein either said previous (PF) or said future (NF) field or frame can be said current field or frame (IF) itself, characterised in that at least one block of each corresponding couple of blocks

is a motion-compensated pixel block or

is mapped to the other block by an associated motion vector estimate.

2. Method according to claim 1, wherein in addition static picture areas are determined and the differences (FD) between pixel values of blocks in a static picture area of a previous field or frame (PF) and corresponding pixel values of corresponding blocks in a future field or frame (NF) are used to estimate a further noise level estimate ($\sigma_{p1}$) which is combined with said noise level estimate ($\sigma_{p2}$, $\sigma_{p3}$) in order to form a final noise level estimate ($\sigma$), wherein said previous (PF) and/or said future (NF) field or frame used for the evaluation of said differences (FD) between pixel values of a block in a static picture area can be different from said previous (PF) and/or said future (NF) field or frame used for the evaluation of the differences (DFD) concerning said motion-compensated pixel blocks or said mapped blocks.

3. Method according to claim 1 or 2, wherein the absolute values of said differences (DFD, FD) between pixel values are accumulated (ADFD, AFD) for each block.


4. Method according to any of claims 1 to 3, wherein said blocks are overlapping. 

5. Method according to any of claims 1 to 4, wherein for said noise level estimate two estimates ($\sigma_{p2}$, $\sigma_{p3}$) are calculated, wherein the first one ($\sigma_{p2}$) is derived from the average ($\langle ADFD(i,j) \rangle$) of the accumulated block pixel difference values (ADFD) for the current field or frame and wherein the second one ($\sigma_{p3}$) is derived from the minimum (min(ADFD(i,j))) of the accumulated block pixel difference values for the current field or frame.

6. Method according to claim 5, wherein the final noise level estimate ($\sigma$) is the mean of said further noise level estimate ($\sigma_{p1}$) and the second one ($\sigma_{p3}$) of said noise level estimates ($\sigma_{p2}$, $\sigma_{p3}$), if the ratio between the first one ($\sigma_{p2}$) and the second one ($\sigma_{p3}$) of said noise level estimates is greater than a predetermined threshold ($T_r$), in particular about "2",

and wherein the final noise level estimate ($\sigma$) is the median of said further noise level estimate ($\sigma_{p1}$) and of the mean of this further noise level estimate ($\sigma_{p1}$) and the first one ($\sigma_{p2}$) of said noise level estimates and of said second one ($\sigma_{p3}$) of said noise level estimates, if the ratio between the first one ($\sigma_{p2}$) and the second one ($\sigma_{p3}$) of said noise level estimates is equal to or smaller than said predetermined threshold ($T_r$).

7. Method according to claim 6, wherein said final noise level estimate ($\sigma_p$) is median filtered together with a noise level estimate for a previous frame or field ($\sigma_{prev}$) from which a first predetermined constant ($\Delta v_{low}$) is subtracted and with said noise level estimate for a previous frame or field ($\sigma_{prev}$) to which a second predetermined constant ($\Delta v_{high}$) is added, in order to form a final output noise level estimate ($\sigma$).

8. Method according to claim 7, wherein said first and second predetermined constants ($\Delta v_{low}$, $\Delta v_{high}$) specify the maximum variations of the estimated noise level variance from one estimation cycle to the next.

9. Method according to claim 7 or 8, wherein said first predetermined constant ($\Delta v_{low}$) has a value of about "1".

10. Method according to any of claims 7 to 9, wherein said second predetermined constant ($\Delta v_{high}$) has a value of about "0.25".

11. Method according to any of claims 1 to 10, wherein in situations where all or nearly all pixels of a picture are in motion, a fallback noise level estimation is carried out and that estimation

is based solely on the determination of pixel value differences (DFD) concerning motion-compensated interpolated pixel blocks or blocks mapped by an associated motion vector estimate, or

is based on a noise level estimate computed for a previous field or frame.